Customizing Floating-Point Operators for Linear Algebra Acceleration on FPGAs

Authors

  • Bogdan Pasca
  • Florent de Dinechin
Abstract

Accelerating the execution of algorithms involving floating-point computations is currently a necessity in the scientific community. One solution, FPGAs, is believed to provide the best balance between cost, performance and flexibility. The FPGA's flexibility is best exploited when it is used to accelerate "exotic" operators (log, exp, dot product) and operators tailored to the numerics of each application. These requirements gave birth to FloPoCo, a floating-point core generator written in C++ and available under the GPL [1]. The purpose of this work was to bring FloPoCo to maturity through framework stabilization and operator implementation. In terms of framework stabilization, we have implemented a novel automatic pipeline generation feature. In terms of implemented operators, our work includes the basic blocks of FloPoCo: IntAdder, IntMultiplier, FPMultiplier, DotProduct, Karatsuba multiplication, FPAdder, LongAccumulator. The obtained results are promising, ranking higher than the results obtained by FPLibrary [2] and not far from the operators generated with Xilinx CoreGen [3]. However, we generate portable, target-independent VHDL code, while CoreGen works only with Xilinx FPGAs. Nevertheless, work is still needed to bring some operators to CoreGen level. We also studied the possibility of implementing interval arithmetic operators on FPGAs. We have proposed and discussed architectures for the four basic operations, for both infimum-supremum and midpoint-radius representations. We have also proposed an application for testing different trade-offs in terms of precision and size of these operators.

[1] http://www.ens-lyon.fr/LIP/Arenaire/Ware/FloPoCo
[2] http://www.ens-lyon.fr/LIP/Arenaire/Ware/FPLibrary/
[3] http://www.xilinx.com/ipcenter/coregen
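The two interval representations named in the abstract can be illustrated with a small software sketch. This is not the hardware architecture proposed by the authors. The function names are hypothetical, and directed rounding is only emulated here by widening each bound by one unit in the last place with `math.nextafter` (Python 3.9 or later), which is an assumption made for illustration.

```python
import math

# Infimum-supremum form: an interval is a pair (lo, hi) with lo <= hi.
def add_infsup(a, b):
    """Add two (lo, hi) intervals, widening each bound outward by one ulp."""
    lo = math.nextafter(a[0] + b[0], -math.inf)  # emulate rounding toward -inf
    hi = math.nextafter(a[1] + b[1], math.inf)   # emulate rounding toward +inf
    return (lo, hi)

def mul_infsup(a, b):
    """Multiply two (lo, hi) intervals using the four endpoint products."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    lo = math.nextafter(min(products), -math.inf)
    hi = math.nextafter(max(products), math.inf)
    return (lo, hi)

# Midpoint-radius form: an interval is a pair (mid, rad) with rad >= 0,
# denoting the set [mid - rad, mid + rad].
def to_midrad(iv):
    """Convert (lo, hi) to (mid, rad), widening the radius by one ulp."""
    lo, hi = iv
    mid = lo + (hi - lo) / 2
    rad = math.nextafter(max(mid - lo, hi - mid), math.inf)
    return (mid, rad)

def add_midrad(a, b):
    """Add two (mid, rad) intervals.

    The midpoint sum is rounded to nearest, so half an ulp of the result is
    added to the radius to account for that rounding error.
    """
    mid = a[0] + b[0]
    err = math.ulp(mid) / 2
    rad = math.nextafter(a[1] + b[1] + err, math.inf)
    return (mid, rad)
```

In the infimum-supremum form, each addition needs two rounded sums and each multiplication needs several products plus a min/max selection. In the midpoint-radius form, addition needs one sum for the midpoint and a slightly wider radius. Trade-offs of this kind between precision and operator size are what the abstract refers to.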


Similar resources

Customizing floating-point units for FPGAs: Area-performance-standard trade-offs

Keywords: Floating-point arithmetic; FPGAs; Library of operators; High performance. The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed targeting configurable devices and making design decisions providing the most suitable perfo...


FPGA Optimizations for a Pipelined Floating-Point Exponential Unit

The large number of available DSP slices on new-generation FPGAs allows for efficient mapping and acceleration of floating-point intensive codes. Numerous scientific codes heavily rely on executing the exponential function. To this end, we present the design and implementation of a pipelined CORDIC/TD-based (COordinate Rotation DIgital Computer/Table Driven) Exponential Approximation Unit (EAU) ...


Parameterized floating-point logarithm and exponential functions for FPGAs

As FPGAs are increasingly being used for floating-point computing, the feasibility of a library of floating-point elementary functions for FPGAs is discussed. An initial implementation of such a library contains parameterized operators for the logarithm and exponential functions. In single precision, those operators use a small fraction of the FPGA’s resources, have a smaller latency than their...


Group-Alignment based Accurate Floating-Point Summation on FPGAs

Floating-point summation is one of the most important operations in scientific/numerical computing applications and also a basic subroutine (SUM) in BLAS (Basic Linear Algebra Subprograms) library. However, standard floating-point arithmetic based summation algorithms may not always result in accurate solutions because of possible catastrophic cancellations. To make the situation worse, the seq...


An Independent Analysis of Floating-point DSP Design Flow and Performance on Altera 28-nm FPGAs

Overview: FPGAs are increasingly used as parallel processing engines for demanding digital signal processing applications. Benchmark results show that on highly parallelizable workloads, FPGAs can achieve higher performance and superior cost/performance compared to digital signal processors (DSPs) and general-purpose CPUs. However, to date, FPGAs have been used almost exclusively for fixed-point...



Publication date: 2009